data<-read.csv("auto.csv", header= TRUE)
The missing values are the first 22 obs, so we delete them
data <- data[-(1:22),]
Now we create a time series object and plot it
Ita_ts <- ts(data$Italy, start = c(1990, 1),
end = c(2024, 2), frequency =12)
plot(Ita_ts)
As we can see, there is a strong stagional component and a structural break around March 2020 (due to the Covid epidemy)
kpss.test(Ita_ts)
## Warning in kpss.test(Ita_ts): p-value smaller than printed p-value
##
## KPSS Test for Level Stationarity
##
## data: Ita_ts
## KPSS Level = 2.267, Truncation lag parameter = 5, p-value = 0.01
adf.test(Ita_ts)
##
## Augmented Dickey-Fuller Test
##
## data: Ita_ts
## Dickey-Fuller = -3.3571, Lag order = 7, p-value = 0.06132
## alternative hypothesis: stationary
Even if the ADF-Test can suggest stationarity for certain values of alfa, the KPPS suggests that the series is non stationary.
acf(Ita_ts)
pacf(Ita_ts)
Box.test(Ita_ts, lag = 12, type = "Ljung-Box")
##
## Box-Ljung test
##
## data: Ita_ts
## X-squared = 821.85, df = 12, p-value < 2.2e-16
So there is autocorrelation
dec<-decompose(Ita_ts)
plot(dec)
We calculate the series trend by using a moving average smoother. The order of the smoother should be derived from the frequency of the series. The general rule of thumb is to use two sides moving average with an order of frequency / 2. For example, in the case, as the frequency is monthly or 12, we will use two sides moving average, where each side is 6 consecutive observations. This means that for calculating the smoothed value of the t observation - we will average the t observation along with the previous and following 6 observations (i.e., averaging 13 data points).
library(plotly)
## Warning: il pacchetto 'plotly' è stato creato con R versione 4.3.3
##
## Caricamento pacchetto: 'plotly'
## Il seguente oggetto è mascherato da 'package:MASS':
##
## select
## Il seguente oggetto è mascherato da 'package:ggplot2':
##
## last_plot
## Il seguente oggetto è mascherato da 'package:stats':
##
## filter
## Il seguente oggetto è mascherato da 'package:graphics':
##
## layout
library(dplyr)
##
## ######################### Warning from 'xts' package ##########################
## # #
## # The dplyr lag() function breaks how base R's lag() function is supposed to #
## # work, which breaks lag(my_xts). Calls to lag(my_xts) that you type or #
## # source() into this session won't work correctly. #
## # #
## # Use stats::lag() to make sure you're not using dplyr::lag(), or you can add #
## # conflictRules('dplyr', exclude = 'lag') to your .Rprofile to stop #
## # dplyr from breaking base R's lag() function. #
## # #
## # Code in packages is not affected. It's protected by R's namespace mechanism #
## # Set `options(xts.warn_dplyr_breaks_lag = FALSE)` to suppress this warning. #
## # #
## ###############################################################################
##
## Caricamento pacchetto: 'dplyr'
## Il seguente oggetto è mascherato da 'package:MASS':
##
## select
## I seguenti oggetti sono mascherati da 'package:xts':
##
## first, last
## I seguenti oggetti sono mascherati da 'package:plm':
##
## between, lag, lead
## Il seguente oggetto è mascherato da 'package:car':
##
## recode
## I seguenti oggetti sono mascherati da 'package:stats':
##
## filter, lag
## I seguenti oggetti sono mascherati da 'package:base':
##
## intersect, setdiff, setequal, union
library(lubridate)
## Warning: il pacchetto 'lubridate' è stato creato con R versione 4.3.3
##
## Caricamento pacchetto: 'lubridate'
## I seguenti oggetti sono mascherati da 'package:base':
##
## date, intersect, setdiff, union
library(TSstudio)
## Warning: il pacchetto 'TSstudio' è stato creato con R versione 4.3.3
smooth <- ts_ma(Ita_ts, n = 6,
separate = FALSE)
Let’s plot the series with the trend estimation:
smooth$plot %>%
layout(legend = list(x = 0.1, y = 0.9))
Next step, we will convert the series and smoothed trend into a dataframe object with the ts_to_prophet function and merge the two series:
df <- ts_to_prophet(Ita_ts) %>%
select(date = ds, y) %>%
left_join(ts_to_prophet(smooth$ma_6) %>%
select(date = ds, trend = y), by = "date")
head(df, 8)
## date y trend
## 1 1990-01-01 310337 NA
## 2 1990-02-01 219117 NA
## 3 1990-03-01 225059 NA
## 4 1990-04-01 214460 NA
## 5 1990-05-01 232627 NA
## 6 1990-06-01 201085 NA
## 7 1990-07-01 206843 200193.7
## 8 1990-08-01 107075 192300.4
Note: the cost of using the moving average for trend estimation is lost of the first and last n observation. Where n is the order of the moving average, in this case, is the first and last 6 observations.
Next, we will remove the trend from the series by subtracting the trend estimation from the series:
df$detrend <- df$y - df$trend
head(df, 8)
## date y trend detrend
## 1 1990-01-01 310337 NA NA
## 2 1990-02-01 219117 NA NA
## 3 1990-03-01 225059 NA NA
## 4 1990-04-01 214460 NA NA
## 5 1990-05-01 232627 NA NA
## 6 1990-06-01 201085 NA NA
## 7 1990-07-01 206843 200193.7 6649.308
## 8 1990-08-01 107075 192300.4 -85225.385
And plot
ts_plot(df,
title = "Car Registration Detrending") %>%
layout(legend = list(x = 0.1, y = 0.9))
We can now overlap in a plot the various years:
df$year <- year(df$date)
df$month <- month(df$date)
p <- plot_ly()
for(i in unique(df$year)){
temp <- NULL
temp <- df %>% filter(year == i)
p <- p %>% add_lines(x = temp$month,
y = temp$detrend,
name = i)
}
p
Now, let’s calculate the seasonal component and add it to the seasonal plot above:
seasonal_comp <- df %>%
group_by(month) %>%
summarise(month_avg = mean(detrend, na.rm = TRUE),
.groups = "drop")
p %>% add_lines(x = seasonal_comp$month,
y = seasonal_comp$month_avg,
line = list(color = "black", dash = "dash", width = 4),
name = "Seasonal Component")
To calculate the irregular component, we will have to merge the seasonal component back to then to subtract from the series the estimated trend and seasonal components:
df <- df %>% left_join(seasonal_comp, by = "month")
df$irregular <- df$y - df$trend - df$month_avg
head(df)
## date y trend detrend year month month_avg irregular
## 1 1990-01-01 310337 NA NA 1990 1 32983.627 NA
## 2 1990-02-01 219117 NA NA 1990 2 23201.014 NA
## 3 1990-03-01 225059 NA NA 1990 3 38937.100 NA
## 4 1990-04-01 214460 NA NA 1990 4 7643.473 NA
## 5 1990-05-01 232627 NA NA 1990 5 21652.772 NA
## 6 1990-06-01 201085 NA NA 1990 6 19034.531 NA
ts_plot(df[, c("date", "y" ,"trend", "detrend", "month_avg", "irregular")],
title = "Car Registration and its Components",
type = "multiple")
Naive model using the mean of the model
avg_model <- Arima(Ita_ts, c(0,0,0))
avg_forecast <- forecast(avg_model)
avg_forecast
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## Mar 2024 163158.3 99609.21 226707.4 65968.35 260348.3
## Apr 2024 163158.3 99609.21 226707.4 65968.35 260348.3
## May 2024 163158.3 99609.21 226707.4 65968.35 260348.3
## Jun 2024 163158.3 99609.21 226707.4 65968.35 260348.3
## Jul 2024 163158.3 99609.21 226707.4 65968.35 260348.3
## Aug 2024 163158.3 99609.21 226707.4 65968.35 260348.3
## Sep 2024 163158.3 99609.21 226707.4 65968.35 260348.3
## Oct 2024 163158.3 99609.21 226707.4 65968.35 260348.3
## Nov 2024 163158.3 99609.21 226707.4 65968.35 260348.3
## Dec 2024 163158.3 99609.21 226707.4 65968.35 260348.3
## Jan 2025 163158.3 99609.21 226707.4 65968.35 260348.3
## Feb 2025 163158.3 99609.21 226707.4 65968.35 260348.3
## Mar 2025 163158.3 99609.21 226707.4 65968.35 260348.3
## Apr 2025 163158.3 99609.21 226707.4 65968.35 260348.3
## May 2025 163158.3 99609.21 226707.4 65968.35 260348.3
## Jun 2025 163158.3 99609.21 226707.4 65968.35 260348.3
## Jul 2025 163158.3 99609.21 226707.4 65968.35 260348.3
## Aug 2025 163158.3 99609.21 226707.4 65968.35 260348.3
## Sep 2025 163158.3 99609.21 226707.4 65968.35 260348.3
## Oct 2025 163158.3 99609.21 226707.4 65968.35 260348.3
## Nov 2025 163158.3 99609.21 226707.4 65968.35 260348.3
## Dec 2025 163158.3 99609.21 226707.4 65968.35 260348.3
## Jan 2026 163158.3 99609.21 226707.4 65968.35 260348.3
## Feb 2026 163158.3 99609.21 226707.4 65968.35 260348.3
plot(avg_forecast)
avg_forecast$mean
## Jan Feb Mar Apr May Jun Jul Aug
## 2024 163158.3 163158.3 163158.3 163158.3 163158.3 163158.3
## 2025 163158.3 163158.3 163158.3 163158.3 163158.3 163158.3 163158.3 163158.3
## 2026 163158.3 163158.3
## Sep Oct Nov Dec
## 2024 163158.3 163158.3 163158.3 163158.3
## 2025 163158.3 163158.3 163158.3 163158.3
## 2026
Naive model using a RW
RW<-naive(Ita_ts,50)
RW
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## Mar 2024 147094 83319.434 210868.6 49559.218 244628.8
## Apr 2024 147094 56903.144 237284.9 9158.989 285029.0
## May 2024 147094 36633.212 257554.8 -21841.197 316029.2
## Jun 2024 147094 19544.868 274643.1 -47975.563 342163.6
## Jul 2024 147094 4489.736 289698.3 -71000.402 365188.4
## Aug 2024 147094 -9121.145 303309.1 -91816.447 386004.4
## Sep 2024 147094 -21637.641 315825.6 -110958.776 405146.8
## Oct 2024 147094 -33287.712 327475.7 -128776.022 422964.0
## Nov 2024 147094 -44229.697 338417.7 -145510.345 439698.3
## Dec 2024 147094 -54578.885 348766.9 -161338.061 455526.1
## Jan 2025 147094 -64422.306 358610.3 -176392.275 470580.3
## Feb 2025 147094 -73827.576 368015.6 -190776.394 484964.4
## Mar 2025 147094 -82848.467 377036.5 -204572.656 498760.7
## Apr 2025 147094 -91528.575 385716.6 -217847.736 512035.7
## May 2025 147094 -99903.831 394091.8 -230656.585 524844.6
## Jun 2025 147094 -108004.263 402192.3 -243045.126 537233.1
## Jul 2025 147094 -115855.271 410043.3 -255052.207 549240.2
## Aug 2025 147094 -123478.568 417666.6 -266711.033 560899.0
## Sep 2025 147094 -130892.887 425080.9 -278050.256 572238.3
## Oct 2025 147094 -138114.529 432302.5 -289094.804 583282.8
## Nov 2025 147094 -145157.775 439345.8 -299866.520 594054.5
## Dec 2025 147094 -152035.228 446223.2 -310384.677 604572.7
## Jan 2026 147094 -158758.073 452946.1 -320666.380 614854.4
## Feb 2026 147094 -165336.289 459524.3 -330726.894 624914.9
## Mar 2026 147094 -171778.829 465966.8 -340579.908 634767.9
## Apr 2026 147094 -178093.755 472281.8 -350237.755 644425.8
## May 2026 147094 -184288.364 478476.4 -359711.592 653899.6
## Jun 2026 147094 -190369.282 484557.3 -369011.553 663199.6
## Jul 2026 147094 -196342.547 490530.5 -378146.873 672334.9
## Aug 2026 147094 -202213.683 496401.7 -387126.000 681314.0
## Sep 2026 147094 -207987.755 502175.8 -395956.681 690144.7
## Oct 2026 147094 -213669.423 507857.4 -404646.044 698834.0
## Nov 2026 147094 -219262.988 513451.0 -413200.663 707388.7
## Dec 2026 147094 -224772.425 518960.4 -421626.620 715814.6
## Jan 2027 147094 -230201.419 524389.4 -429929.550 724117.5
## Feb 2027 147094 -235553.395 529741.4 -438114.690 732302.7
## Mar 2027 147094 -240831.539 535019.5 -446186.915 740374.9
## Apr 2027 147094 -246038.826 540226.8 -454150.773 748338.8
## May 2027 147094 -251178.036 545366.0 -462010.516 756198.5
## Jun 2027 147094 -256251.769 550439.8 -469770.122 763958.1
## Jul 2027 147094 -261262.468 555450.5 -477433.324 771621.3
## Aug 2027 147094 -266212.424 560400.4 -485003.629 779191.6
## Sep 2027 147094 -271103.794 565291.8 -492484.334 786672.3
## Oct 2027 147094 -275938.612 570126.6 -499878.549 794066.5
## Nov 2027 147094 -280718.793 574906.8 -507189.205 801377.2
## Dec 2027 147094 -285446.150 579634.1 -514419.074 808607.1
## Jan 2028 147094 -290122.395 584310.4 -521570.774 815758.8
## Feb 2028 147094 -294749.153 588937.2 -528646.789 822834.8
## Mar 2028 147094 -299327.960 593516.0 -535649.471 829837.5
## Apr 2028 147094 -303860.279 598048.3 -542581.055 836769.1
plot(RW)
Naive model using RW with drift
RWD<-rwf(Ita_ts,50, drift = TRUE)
RWD
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## Mar 2024 146694.9 82766.244 210623.5 48924.473 244465.3
## Apr 2024 146295.7 55776.825 236814.7 7859.002 284732.5
## May 2024 145896.6 34899.247 256894.0 -23859.212 315652.4
## Jun 2024 145497.5 17173.316 273821.7 -50757.399 341752.4
## Jul 2024 145098.4 1453.987 288742.7 -74586.754 364783.5
## Aug 2024 144699.2 -12845.222 302243.7 -96244.221 385642.7
## Sep 2024 144300.1 -26072.252 314672.5 -116261.933 404862.2
## Oct 2024 143901.0 -38453.515 326255.5 -134986.156 422788.1
## Nov 2024 143501.9 -50146.073 337149.8 -152657.094 439660.8
## Dec 2024 143102.7 -61264.131 347469.6 -169449.411 455654.9
## Jan 2025 142703.6 -71893.794 357301.0 -185494.794 470902.0
## Feb 2025 142304.5 -82101.870 366710.8 -200895.413 485504.4
## Mar 2025 141905.3 -91941.415 375752.1 -215732.413 499543.1
## Apr 2025 141506.2 -101455.392 384467.8 -230071.501 513083.9
## May 2025 141107.1 -110679.162 392893.3 -243966.753 526180.9
## Jun 2025 140708.0 -119642.234 401058.2 -257463.305 538879.2
## Jul 2025 140308.8 -128369.541 408987.2 -270599.283 551217.0
## Aug 2025 139909.7 -136882.362 416701.8 -283407.235 563226.7
## Sep 2025 139510.6 -145199.038 424220.2 -295915.207 574936.4
## Oct 2025 139111.5 -153335.498 431558.4 -308147.564 586370.5
## Nov 2025 138712.3 -161305.686 438730.3 -320125.630 597550.3
## Dec 2025 138313.2 -169121.887 445748.3 -331868.193 608494.6
## Jan 2026 137914.1 -176794.988 452623.1 -343391.903 619220.1
## Feb 2026 137514.9 -184334.692 459364.6 -354711.600 629741.5
## Mar 2026 137115.8 -191749.688 465981.3 -365840.573 640072.2
## Apr 2026 136716.7 -199047.794 472481.2 -376790.778 650224.2
## May 2026 136317.6 -206236.072 478871.2 -387573.016 660208.2
## Jun 2026 135918.4 -213320.928 485157.8 -398197.082 670034.0
## Jul 2026 135519.3 -220308.191 491346.8 -408671.893 679710.5
## Aug 2026 135120.2 -227203.184 497443.6 -419005.590 689246.0
## Sep 2026 134721.1 -234010.785 503452.9 -429205.632 698647.7
## Oct 2026 134321.9 -240735.473 509379.3 -439278.869 707922.7
## Nov 2026 133922.8 -247381.374 515227.0 -449231.613 717077.2
## Dec 2026 133523.7 -253952.299 520999.7 -459069.691 726117.0
## Jan 2027 133124.6 -260451.775 526700.9 -468798.497 735047.6
## Feb 2027 132725.4 -266883.075 532333.9 -478423.036 743873.9
## Mar 2027 132326.3 -273249.240 537901.8 -487947.960 752600.6
## Apr 2027 131927.2 -279553.104 543407.4 -497377.603 761231.9
## May 2027 131528.0 -285797.312 548853.4 -506716.010 769772.1
## Jun 2027 131128.9 -291984.336 554242.2 -515966.961 778224.8
## Jul 2027 130729.8 -298116.490 559576.1 -525133.997 786593.6
## Aug 2027 130330.7 -304195.948 564857.3 -534220.440 794881.8
## Sep 2027 129931.5 -310224.749 570087.8 -543229.410 803092.5
## Oct 2027 129532.4 -316204.813 575269.6 -552163.843 811228.7
## Nov 2027 129133.3 -322137.947 580404.5 -561026.504 819293.1
## Dec 2027 128734.2 -328025.858 585494.2 -569820.002 827288.3
## Jan 2028 128335.0 -333870.157 590540.2 -578546.801 835216.8
## Feb 2028 127935.9 -339672.367 595544.2 -587209.230 843081.0
## Mar 2028 127536.8 -345433.931 600507.5 -595809.496 850883.0
## Apr 2028 127137.6 -351156.216 605431.5 -604349.690 858625.0
plot(RWD)
fit <- ets(Ita_ts)
forecast<-forecast(fit)
forecast
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## Mar 2024 159519.97 133226.90 185813.03 119308.19 199731.7
## Apr 2024 126371.71 104783.25 147960.17 93355.01 159388.4
## May 2024 156823.30 129121.63 184524.97 114457.25 199189.3
## Jun 2024 156618.45 128071.02 185165.88 112958.92 200278.0
## Jul 2024 135863.34 110356.00 161370.67 96853.24 174873.4
## Aug 2024 80259.44 64764.32 95754.56 56561.70 103957.2
## Sep 2024 134571.92 107893.49 161250.35 93770.78 175373.1
## Oct 2024 136761.20 108957.07 164565.34 94238.45 179284.0
## Nov 2024 133740.30 105889.55 161591.05 91146.25 176334.3
## Dec 2024 111937.26 88086.01 135788.51 75459.92 148414.6
## Jan 2025 141610.32 110766.54 172454.09 94438.84 188781.8
## Feb 2025 147309.39 114541.48 180077.29 97195.20 197423.6
## Mar 2025 159650.56 122132.39 197168.73 102271.47 217029.7
## Apr 2025 126475.17 96206.62 156743.71 80183.42 172766.9
## May 2025 156951.68 118722.24 195181.13 98484.79 215418.6
## Jun 2025 156746.67 117911.53 195581.80 97353.46 216139.9
## Jul 2025 135974.56 101725.91 170223.22 83595.76 188353.4
## Aug 2025 80325.15 59767.35 100882.94 48884.72 111765.6
## Sep 2025 134682.09 99674.11 169690.06 81142.01 188222.2
## Oct 2025 136873.16 100756.05 172990.27 81636.81 192109.5
## Nov 2025 133849.79 98009.92 169689.65 79037.45 188662.1
## Dec 2025 112028.90 81601.88 142455.92 65494.79 158563.0
## Jan 2026 141726.25 102696.44 180756.05 82035.31 201417.2
## Feb 2026 147429.98 106277.74 188582.22 84493.06 210366.9
plot(forecast)
summary(fit)
## ETS(M,N,M)
##
## Call:
## ets(y = Ita_ts)
##
## Smoothing parameters:
## alpha = 0.2702
## gamma = 0.1832
##
## Initial states:
## l = 187481.9588
## s = 0.6401 0.8632 0.9606 0.826 0.5217 1.1355
## 1.0948 1.1422 1.1025 1.1927 1.085 1.4357
##
## sigma: 0.1286
##
## AIC AICc BIC
## 10607.99 10609.21 10668.23
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE ACF1
## Training set -1158.965 20603.44 13158.38 -9.244998 16.1703 0.6146821 0.228454
s.exp1 <- HoltWinters(Ita_ts, alpha = 0.1, beta=FALSE, gamma=FALSE, l.start=23.56)
s.exp7 <- HoltWinters(Ita_ts, alpha = 0.7, beta=FALSE, gamma=FALSE, l.start=23.56)
s.exp.estimated <- HoltWinters(Ita_ts)
s.exp.estimated$alpha
## alpha
## 0.4469785
forecast(s.exp.estimated)
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## Mar 2024 157737.23 130845.81 184628.6 116610.360 198864.1
## Apr 2024 126985.53 97530.04 156441.0 81937.248 172033.8
## May 2024 157404.37 125590.80 189217.9 108749.714 206059.0
## Jun 2024 157925.04 123916.50 191933.6 105913.461 209936.6
## Jul 2024 137265.82 101195.64 173336.0 82101.232 192430.4
## Aug 2024 87520.53 49500.33 125540.7 29373.644 145667.4
## Sep 2024 134439.86 94564.88 174314.8 73456.345 195423.4
## Oct 2024 135350.49 93703.27 176997.7 71656.563 199044.4
## Nov 2024 132305.89 88958.82 175653.0 66012.263 198599.5
## Dec 2024 109167.73 64184.99 154150.5 40372.573 177962.9
## Jan 2025 135724.49 89163.53 182285.5 64515.637 206933.4
## Feb 2025 141919.83 93832.40 190007.3 68376.448 215463.2
## Mar 2025 152633.44 100908.05 204358.8 73526.290 231740.6
## Apr 2025 121881.74 68778.13 174985.3 40666.782 203096.7
## May 2025 152300.58 97853.63 206747.5 69031.156 235570.0
## Jun 2025 152821.25 97063.31 208579.2 67546.834 238095.7
## Jul 2025 132162.03 75123.23 189200.8 44928.704 219395.4
## Aug 2025 82416.74 24125.20 140708.3 -6732.470 171566.0
## Sep 2025 129336.07 69818.16 188854.0 38311.289 220360.8
## Oct 2025 130246.70 69527.20 190966.2 37384.234 223109.2
## Nov 2025 127202.10 65304.32 189099.9 32537.607 221866.6
## Dec 2025 104063.94 41009.89 167118.0 7631.087 200496.8
## Jan 2026 130620.71 66431.21 194810.2 32451.347 228790.1
## Feb 2026 136816.05 71510.85 202121.2 36940.366 236691.7
plot(s.exp.estimated)
s.exp.estimated$SSE
## [1] 174953364109
As a rule of thumb, the ets models are not stationary, and are used
mostly when there is a trend and/or seasonality in the data, as this
model explicitly these components.
On the other hand, ARIMA models should be transformed in stationary and
should be used when if you see autocorrelation in the data, i.e. the
past data explains the present data well
miarma=auto.arima(Ita_ts)
summary(miarma)
## Series: Ita_ts
## ARIMA(1,0,1)(0,1,2)[12]
##
## Coefficients:
## ar1 ma1 sma1 sma2
## 0.9086 -0.4370 -0.6367 -0.0959
## s.e. 0.0294 0.0639 0.0525 0.0468
##
## sigma^2 = 430785910: log likelihood = -4523.53
## AIC=9057.05 AICc=9057.21 BIC=9076.99
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set -1040.454 20346.37 13359.8 -6.521588 14.34121 0.624091
## ACF1
## Training set -0.000608621
checkresiduals(miarma)
##
## Ljung-Box test
##
## data: Residuals from ARIMA(1,0,1)(0,1,2)[12]
## Q* = 54.639, df = 20, p-value = 4.648e-05
##
## Model df: 4. Total lags used: 24
We have a MAPE of 14.5 %, not bad, but hopefully we can do better.
Test of normality for the residuals
shapiro.test(miarma$residuals)
##
## Shapiro-Wilk normality test
##
## data: miarma$residuals
## W = 0.89232, p-value < 2.2e-16
We reject the hypotesis of normality of the residuals, this can generate problems in the forecast.
Plot the residuals to check for autocorrelation
plot(miarma$residuals)
acf(miarma$residuals)
No autocorrelation
Let’s make forecast
l.forecast <- forecast(miarma)
plot(l.forecast)
miarma2=auto.arima(log(Ita_ts))
summary(miarma2)
## Series: log(Ita_ts)
## ARIMA(1,0,2)(0,1,1)[12]
##
## Coefficients:
## ar1 ma1 ma2 sma1
## 0.9572 -0.3204 -0.4168 -0.8365
## s.e. 0.0272 0.0586 0.0593 0.0287
##
## sigma^2 = 0.04467: log likelihood = 48.81
## AIC=-87.61 AICc=-87.46 BIC=-67.68
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set -0.009564315 0.2071882 0.1012619 -0.1103667 0.8768409 0.6658212
## ACF1
## Training set 0.005235695
checkresiduals(miarma2)
##
## Ljung-Box test
##
## data: Residuals from ARIMA(1,0,2)(0,1,1)[12]
## Q* = 23.183, df = 20, p-value = 0.2799
##
## Model df: 4. Total lags used: 24
Test of normality for the residuals
shapiro.test(miarma2$residuals)
##
## Shapiro-Wilk normality test
##
## data: miarma2$residuals
## W = 0.57884, p-value < 2.2e-16
Plot the residuals to check for autocorrelation
plot(miarma2$residuals)
acf(miarma2$residuals)
Let’s make forecast
l.forecast2 <- forecast(miarma2)
plot(l.forecast2)
Try to create a VAR model using all of the nations
data2 <- data[-(1:12),]
data_ts<- ts(data2[,c(6,7,11,12,14,19)], start = c(1990, 1),
end = c(2024, 2), frequency =12)
sum(is.na(data_ts))
## [1] 0
plot(data_ts)
Var_model <- VAR(data_ts, lag.max =12)
Chosen order:
Var_model$p
## AIC(n)
## 12
Each model is like an lm object with its own coefficients and residuals:
residuals(Var_model$varresult$Italy)
## 1 2 3 4 5
## 14465.23012 -23298.57929 15366.43305 48270.73924 -12469.59035
## 6 7 8 9 10
## 25378.30384 11861.83902 -57154.36630 -6692.73930 10199.88264
## 11 12 13 14 15
## 8455.04746 27565.93716 -13695.89612 -23261.69942 -34964.18503
## 16 17 18 19 20
## -27743.64909 -15454.94667 -33938.28053 -22940.03040 -6828.76389
## 21 22 23 24 25
## -3970.65990 -20121.02044 -4780.27699 -11255.02194 18146.41698
## 26 27 28 29 30
## -25576.06949 12216.03294 -23039.52080 11603.71323 -2401.24375
## 31 32 33 34 35
## -16175.54538 11215.03399 -10015.98924 15397.61866 7533.18325
## 36 37 38 39 40
## -2107.44290 18225.12656 -24643.44477 4379.22263 -9615.34337
## 41 42 43 44 45
## 18549.76311 4129.15175 -12620.80415 8488.74366 -11615.27879
## 46 47 48 49 50
## -7962.00573 33763.45425 -3544.00682 4560.04699 -11461.19826
## 51 52 53 54 55
## 1115.16090 -2456.79541 1070.44404 4107.51233 -22772.95711
## 56 57 58 59 60
## 2871.50126 4218.20340 15068.39864 -3016.12502 -141.60103
## 61 62 63 64 65
## -1530.01172 26602.97987 16651.00207 26085.92403 31873.23750
## 66 67 68 69 70
## 14216.36736 30429.52304 -24506.02082 57843.92940 34107.49615
## 71 72 73 74 75
## 13943.01319 -715.32542 33323.47960 6284.95900 -17118.91237
## 76 77 78 79 80
## 16350.98984 -5567.25906 -11653.44449 35045.11080 -12401.93070
## 81 82 83 84 85
## -13082.57243 -13333.77837 8869.82017 -3909.88376 -11190.78532
## 86 87 88 89 90
## 18516.77324 24844.25367 -4825.78514 5931.86453 8934.23552
## 91 92 93 94 95
## -4550.28599 -12263.06893 -17226.40709 1869.71977 4748.93793
## 96 97 98 99 100
## -26916.23191 58068.26619 37504.68814 12008.46797 -10485.72070
## 101 102 103 104 105
## -1865.39476 -31573.42991 16382.35622 25575.62359 -8663.75120
## 106 107 108 109 110
## -6290.05140 -21286.37488 281.13721 53126.87033 2766.36587
## 111 112 113 114 115
## 17411.07105 1420.43488 -8718.85341 2409.47475 -15317.23852
## 116 117 118 119 120
## -6171.86210 -25245.88683 27456.05929 -3369.20137 -11630.11545
## 121 122 123 124 125
## 37069.33744 -27880.04464 -15620.96824 -10865.86010 -1505.84720
## 126 127 128 129 130
## -7251.29637 -4006.97372 -1859.23265 -2307.54412 6988.43567
## 131 132 133 134 135
## 7986.44158 55801.51506 -35352.35439 8739.51063 40325.71962
## 136 137 138 139 140
## -49708.85532 -12454.82025 9307.20757 39083.62390 -25381.42561
## 141 142 143 144 145
## 25073.93121 20841.95623 -13865.46087 -14329.51298 22720.93823
## 146 147 148 149 150
## 19243.04749 -7079.86529 888.47921 -513.82435 -10663.32265
## 151 152 153 154 155
## -1981.15622 -18508.04329 7537.99416 -15705.48308 3221.23815
## 156 157 158 159 160
## 3809.21178 -7348.39590 9258.75889 -17525.35426 7460.25279
## 161 162 163 164 165
## -52354.26961 53638.64585 6172.87216 1876.50989 6415.98240
## 166 167 168 169 170
## -18510.69213 19327.40714 -20382.03669 37684.40101 5298.30737
## 171 172 173 174 175
## 2151.48012 -19983.21555 27515.44326 -11250.56492 -24136.81664
## 176 177 178 179 180
## 17180.09187 -6172.28318 15499.05211 7005.86087 -10016.58883
## 181 182 183 184 185
## 38876.16493 -1750.11897 -6512.43955 10416.33663 16932.57279
## 186 187 188 189 190
## -12655.59547 -11199.14793 8082.92064 -8946.07056 20181.00617
## 191 192 193 194 195
## 8208.74670 10950.18152 -729.94594 -4090.46239 -29381.17940
## 196 197 198 199 200
## 15265.80052 -11207.99484 -24271.28923 4948.88694 5090.69665
## 201 202 203 204 205
## 104.26908 -44734.05557 -10658.87387 16114.44678 -72653.95689
## 206 207 208 209 210
## 4363.56473 2474.33756 8889.85839 -5632.67846 2283.71091
## 211 212 213 214 215
## 12149.61311 -33290.47547 17015.54787 22227.61780 12043.61300
## 216 217 218 219 220
## -5528.52372 -6937.69101 18670.31345 18435.97049 -39517.11310
## 221 222 223 224 225
## -26719.17915 -6195.36606 -8671.54938 8127.42620 6927.21362
## 226 227 228 229 230
## -5503.74148 7229.60047 -2690.96038 -17320.57689 -2136.36381
## 231 232 233 234 235
## -3160.05302 16512.85034 -7566.62836 -6614.84395 -9418.68955
## 236 237 238 239 240
## -1257.29703 6967.89859 -7222.92312 1274.26984 -8110.35400
## 241 242 243 244 245
## -18584.61591 -8296.75620 -23253.95100 17059.10204 211.69738
## 246 247 248 249 250
## -14163.31467 -16187.55758 -13574.21051 -19455.70778 12327.32145
## 251 252 253 254 255
## -56477.64780 -14843.92534 -3172.28067 -20834.26169 -13465.80844
## 256 257 258 259 260
## 15456.49259 5333.06237 -6475.52370 -9340.25265 -3161.21857
## 261 262 263 264 265
## 2576.43531 -2095.11680 -4286.38272 -7289.66959 -14900.73550
## 266 267 268 269 270
## -2360.89663 772.37779 7446.11078 -2313.82287 -2334.37774
## 271 272 273 274 275
## -2437.07270 -10839.47838 -5611.86570 -1456.59583 -3252.86260
## 276 277 278 279 280
## -5496.22719 -8316.71468 5045.42356 8830.44440 16145.44974
## 281 282 283 284 285
## -3760.95462 -26.42791 6233.24086 -21565.12575 7048.53906
## 286 287 288 289 290
## -1773.11070 12845.37005 -11805.12682 3341.29585 33201.90067
## 291 292 293 294 295
## 7886.18394 11501.57317 9604.94370 -2278.59498 -12501.87219
## 296 297 298 299 300
## -18086.47418 24815.99227 1913.27795 -533.75799 -11366.54551
## 301 302 303 304 305
## 4517.00621 24071.11152 26701.71962 -16344.06040 7853.96795
## 306 307 308 309 310
## 4715.67279 -21061.79336 -17606.40782 8555.58234 9840.27451
## 311 312 313 314 315
## -5357.45033 -18080.76038 9281.08662 7983.78928 18092.63087
## 316 317 318 319 320
## 9255.07136 -613.23540 -498.79603 -31261.37040 -6713.55897
## 321 322 323 324 325
## 19144.23595 15055.55014 -32011.34399 -12669.77883 -21382.78683
## 326 327 328 329 330
## 22412.48887 6896.67715 3282.29425 13742.80559 -2647.71032
## 331 332 333 334 335
## -10345.15372 -28751.64789 -3912.15970 10855.81820 -9376.81540
## 336 337 338 339 340
## 11471.75290 -31104.18548 -16106.71982 -118791.09917 -79535.54967
## 341 342 343 344 345
## -332.15172 13579.52574 31285.15531 -13950.86367 18920.15158
## 346 347 348 349 350
## 12125.24577 26788.52670 39163.13996 -7147.95540 -17680.52127
## 351 352 353 354 355
## 10243.05846 20224.09304 2195.70971 11312.68927 -5994.66638
## 356 357 358 359 360
## -14872.16731 -5034.14286 -20707.69491 2038.65231 -14084.53968
## 361 362 363 364 365
## -21634.44694 -11986.85619 -23832.67048 -6654.49920 10464.75674
## 366 367 368 369 370
## 2324.46452 1282.82534 -15729.99099 -4679.67323 10487.13877
## 371 372 373 374 375
## 11879.55396 9090.02012 36295.30831 2019.06342 6075.44825
## 376 377 378 379 380
## 6653.37382 6398.62996 -11137.13833 -4135.92620 7258.20662
## 381 382 383 384 385
## -1781.56185 16815.91598 5470.61890 -1912.31648 16326.80458
## 386 387 388 389 390
## -2300.88806 130480.72777 7291.50875 15127.16877 -27862.85749
## 391 392 393 394 395
## 1761.03518 59798.47937 21966.50101 -30145.94640 -21720.66222
## 396 397 398
## 26631.29985 -1689.64735 -2984.31158
Predictions
predict(Var_model)
## $Finland
## fcst lower upper CI
## [1,] 10736.157 7685.0269 13787.286 3051.130
## [2,] 7093.412 4003.0533 10183.770 3090.359
## [3,] 6607.925 3375.1330 9840.716 3232.792
## [4,] 7610.973 4221.8312 11000.116 3389.142
## [5,] 8460.372 5004.7314 11916.013 3455.641
## [6,] 5390.049 1879.8890 8900.209 3510.160
## [7,] 4161.065 584.6641 7737.466 3576.401
## [8,] 5665.090 2055.6534 9274.526 3609.436
## [9,] 6019.652 2352.2390 9687.065 3667.413
## [10,] 4930.802 1205.3865 8656.218 3725.416
##
## $Sweden
## fcst lower upper CI
## [1,] 11781.221 4705.5096 18856.93 7075.711
## [2,] 15393.869 8032.5651 22755.17 7361.304
## [3,] 18768.058 11024.4384 26511.68 7743.620
## [4,] 19388.452 11304.7219 27472.18 8083.731
## [5,] 15723.355 7548.0415 23898.67 8175.314
## [6,] 17921.210 9599.9347 26242.48 8321.275
## [7,] 13677.827 5031.7457 22323.91 8646.081
## [8,] 9548.097 733.8834 18362.31 8814.213
## [9,] 13387.102 4402.9062 22371.30 8984.196
## [10,] 15615.606 6346.4259 24884.79 9269.180
##
## $Italy
## fcst lower upper CI
## [1,] 269866.8 224805.17 314928.4 45061.60
## [2,] 223766.0 173797.55 273734.4 49968.41
## [3,] 213906.9 160446.12 267367.7 53460.77
## [4,] 206375.2 149892.31 262858.1 56482.91
## [5,] 206859.9 146491.99 267227.8 60367.91
## [6,] 188496.1 127518.46 249473.6 60977.59
## [7,] 184166.7 122347.90 245985.4 61818.77
## [8,] 101878.2 38708.09 165048.3 63170.13
## [9,] 153282.3 88783.99 217780.6 64498.31
## [10,] 182972.4 117749.19 248195.6 65223.22
##
## $Gerany
## fcst lower upper CI
## [1,] 282595.4 223431.5 341759.2 59163.82
## [2,] 269821.6 203438.4 336204.8 66383.21
## [3,] 364669.8 294270.3 435069.2 70399.42
## [4,] 371330.6 297413.2 445248.0 73917.40
## [5,] 329319.2 251344.4 407293.9 77974.71
## [6,] 357624.4 277169.7 438079.2 80454.75
## [7,] 350464.5 267647.9 433281.1 82816.55
## [8,] 230799.3 145771.5 315827.1 85027.80
## [9,] 264384.0 178789.2 349978.8 85594.81
## [10,] 277144.4 191174.4 363114.5 85970.07
##
## $Norway
## fcst lower upper CI
## [1,] 5303.020 703.0957 9902.945 4599.925
## [2,] 6860.692 2126.7441 11594.639 4733.947
## [3,] 7891.515 3058.4923 12724.538 4833.023
## [4,] 7727.703 2622.9608 12832.445 5104.742
## [5,] 9261.967 4057.7652 14466.170 5204.202
## [6,] 8154.561 2864.5390 13444.582 5290.022
## [7,] 7764.193 2145.0083 13383.379 5619.185
## [8,] 7696.153 2007.5523 13384.753 5688.600
## [9,] 8288.532 2515.7485 14061.316 5772.784
## [10,] 7572.705 1625.5735 13519.837 5947.132
##
## $UK
## fcst lower upper CI
## [1,] 155775.01 77632.63 233917.4 78142.38
## [2,] 103776.83 24326.75 183226.9 79450.08
## [3,] 157615.50 75949.20 239281.8 81666.29
## [4,] 157845.92 75653.31 240038.5 82192.60
## [5,] 103564.24 21174.71 185953.8 82389.53
## [6,] 94640.71 11858.75 177422.7 82781.96
## [7,] 36202.67 -49034.46 121439.8 85237.13
## [8,] 319697.05 233799.93 405594.2 85897.11
## [9,] 124432.47 37439.92 211425.0 86992.54
## [10,] 125317.53 38156.14 212478.9 87161.38
The predictions are not that good, we were expecting that: the residuals are high and coefficients low. So: The countries cannot explain each others’ behavior.
#ARIMAX
#Import time series about gas prices
carburanti<-read.csv("Carburanti.csv", header= TRUE)
Benzina_ts <- ts(carburanti$Benzina, start = c(1996, 1),
end = c(2024, 2), frequency =12)
plot(Benzina_ts)
plot(diff(Benzina_ts))
#To use ARIMAX we need both series to be stationary
adf.test(Ita_ts)
##
## Augmented Dickey-Fuller Test
##
## data: Ita_ts
## Dickey-Fuller = -3.3571, Lag order = 7, p-value = 0.06132
## alternative hypothesis: stationary
adf.test(Benzina_ts)
##
## Augmented Dickey-Fuller Test
##
## data: Benzina_ts
## Dickey-Fuller = -2.4064, Lag order = 6, p-value = 0.4052
## alternative hypothesis: stationary
#Create the same window to have the series at the same lenght
Ita_ts_modify <- window(Ita_ts, start=c(1996,1))
fit2 <- auto.arima(Ita_ts_modify, xreg=Benzina_ts)
summary(fit2)
## Series: Ita_ts_modify
## Regression with ARIMA(1,0,1)(1,1,2)[12] errors
##
## Coefficients:
## ar1 ma1 sar1 sma1 sma2 drift xreg
## 0.8777 -0.3683 0.0729 -0.7431 -0.0149 -163.5631 -1.5490
## s.e. 0.0435 0.0791 0.3789 0.3747 0.2715 154.3512 24.4844
##
## sigma^2 = 425917277: log likelihood = -3702.49
## AIC=7420.97 AICc=7421.43 BIC=7451.27
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE ACF1
## Training set 544.2815 20049.32 13225.33 -6.215407 14.76927 0.6096376 0.01099098
checkresiduals(fit2)
##
## Ljung-Box test
##
## data: Residuals from Regression with ARIMA(1,0,1)(1,1,2)[12] errors
## Q* = 40.53, df = 19, p-value = 0.002786
##
## Model df: 5. Total lags used: 24
We try with to differentiate Gasoline price and fit it into the ARIMA model
d_Benzina_ts<-diff(Benzina_ts)
adf.test(d_Benzina_ts)
## Warning in adf.test(d_Benzina_ts): p-value smaller than printed p-value
##
## Augmented Dickey-Fuller Test
##
## data: d_Benzina_ts
## Dickey-Fuller = -7.6558, Lag order = 6, p-value = 0.01
## alternative hypothesis: stationary
#Change the window by one month (we lose an obs because we differentiate)
Ita_ts_modify2 <- window(Ita_ts, start=c(1996,2))
#Fit the model
fit <- auto.arima(Ita_ts_modify2, xreg=d_Benzina_ts)
summary(fit)
## Series: Ita_ts_modify2
## Regression with ARIMA(1,0,1)(1,1,2)[12] errors
##
## Coefficients:
## ar1 ma1 sar1 sma1 sma2 drift xreg
## 0.8828 -0.3799 0.0725 -0.7390 -0.0194 -162.4691 17.2047
## s.e. 0.0397 0.0777 0.3669 0.3633 0.2634 145.9534 25.0419
##
## sigma^2 = 426540455: log likelihood = -3691.36
## AIC=7398.71 AICc=7399.17 BIC=7428.98
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE ACF1
## Training set 506.6749 20062.21 13264.28 -6.231399 14.81549 0.6100593 0.01050806
checkresiduals(fit)
##
## Ljung-Box test
##
## data: Residuals from Regression with ARIMA(1,0,1)(1,1,2)[12] errors
## Q* = 40.7, df = 19, p-value = 0.002645
##
## Model df: 5. Total lags used: 24
We create dummy variable to handle the structural break of Covid
dates <- as.Date(time(Ita_ts_modify), origin = "1970-01-01")
# Creation of time series with a dummy variable
dummy <- ifelse(dates == as.Date("2020-03-01"), 1, 0)
dummy_ts <- ts(dummy, start = c(1996, 1), frequency = 12)
Then we merge in a unique object ts to use it as a regressor
x_ts <- cbind(dummy_ts, Benzina_ts)
fit <- auto.arima(Ita_ts_modify, xreg= x_ts)
summary(fit)
## Series: Ita_ts_modify
## Regression with ARIMA(1,0,1)(1,1,2)[12] errors
##
## Coefficients:
## ar1 ma1 sar1 sma1 sma2 drift dummy_ts
## 0.9189 -0.4645 0.0162 -0.6894 -0.0432 -149.0849 -108737.98
## s.e. 0.0338 0.0720 0.5078 0.5046 0.3617 181.4630 16704.03
## Benzina_ts
## 3.1875
## s.e. 22.0011
##
## sigma^2 = 376135288: log likelihood = -3681.52
## AIC=7381.04 AICc=7381.61 BIC=7415.13
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set 384.3803 18811.67 12774.73 -9.269846 17.47593 0.5888668
## ACF1
## Training set 0.005912887
I don’t think things are better than the normal ARIMA
#let’s take again the various countries
data_ts <-window (data_ts, start=c(1990,1))
Ita_ts2 <- window (Ita_ts, start=c(1990,1))
fit <- auto.arima(Ita_ts2, xreg= data_ts)
The MAPE is reduced.